
[test_pfcwd_timer_accuracy] handle exception during grep log #16446

Merged
StormLiangMS merged 1 commit into sonic-net:master from lipxu:20250110_public_pfcwd_accuracy
Jan 13, 2025

Conversation

@lipxu (Contributor) commented Jan 10, 2025

Description of PR

Summary:
Fixes # (issue)
29970341

Type of change

  • Bug fix
  • Testbed and Framework(new/improvement)
  • New Test case
    • Skipped for non-supported platforms
    • Add ownership here (Microsoft required only)
  • Test case improvement

Back port request

  • 202012
  • 202205
  • 202305
  • 202311
  • 202405
  • 202411

Approach

What is the motivation for this PR?

Due to limitations of the test environment, pfcwd is sometimes not triggered as expected.
When that happens, the grep for the PFC watchdog syslog entry fails and the test aborts with an exception.

How did you do it?

Handle the exception and return a failure instead of raising an exception.
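
A minimal sketch of this idea, not the code from the PR. The names `safe_timestamp_ms` and `fetch` are hypothetical: `fetch` stands for whatever call greps the syslog and returns the timestamp in milliseconds, and it is assumed to raise when grep finds nothing. The value 0 is used as the failure marker because the log below reports missing timestamps as 0.

```python
import logging

logger = logging.getLogger(__name__)


def safe_timestamp_ms(fetch, pattern):
    """Return the timestamp in ms for `pattern`, or 0 if it cannot be retrieved.

    `fetch` is a hypothetical callable that greps the syslog for `pattern`
    and returns the timestamp in milliseconds (as int or numeric string).
    It is assumed to raise an exception when the grep command fails.
    """
    try:
        return int(fetch(pattern))
    except Exception as exc:  # any failure is reported as "no timestamp"
        logger.warning("Failed to get timestamp for pattern %s: %s", pattern, exc)
        return 0
```

The caller can then treat 0 as "timestamp missing" instead of having to catch the exception itself.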

How did you verify/test it?

Inject a failure and run the case locally.

pfcwd/test_pfcwd_timer_accuracy.py::TestPfcwdAllTimer::test_pfcwd_timer_accuracy[bjw-can-7050qx-1] 
---------------------------------------------------------------------------------------------------------------------- live log call -----------------------------------------------------------------------------------------------------------------------
05:57:06 test_pfcwd_timer_accuracy.retrieve_times L0371 WARNING| ##### Randomly skip timestamp 05:56:16.168697 parsing for pattern: [d]etected PFC storm
05:57:26 test_pfcwd_timer_accuracy.run_test       L0228 WARNING| storm_start_ms 1736488575601 or storm_detect_ms 0 or storm_end_ms 1736488588756 or storm_restore_ms 1736488589364 is 0
05:57:26 test_pfcwd_timer_accuracy.run_test       L0232 WARNING| Skip this loop due to missing timestamps
06:16:58 test_pfcwd_timer_accuracy.retrieve_times L0371 WARNING| ##### Randomly skip timestamp 06:16:07.324006 parsing for pattern: [d]etected PFC storm
06:17:17 test_pfcwd_timer_accuracy.run_test       L0228 WARNING| storm_start_ms 1736489766762 or storm_detect_ms 0 or storm_end_ms 1736489779979 or storm_restore_ms 1736489780540 is 0
06:17:17 test_pfcwd_timer_accuracy.run_test       L0232 WARNING| Skip this loop due to missing timestamps
06:20:04 test_pfcwd_timer_accuracy.retrieve_times L0371 WARNING| ##### Randomly skip timestamp 06:19:08.259521 parsing for pattern: [P]FC_STORM_END
06:20:06 test_pfcwd_timer_accuracy.run_test       L0228 WARNING| storm_start_ms 1736489935710 or storm_detect_ms 1736489936402 or storm_end_ms 0 or storm_restore_ms 1736489948823 is 0
06:20:06 test_pfcwd_timer_accuracy.run_test       L0232 WARNING| Skip this loop due to missing timestamps
06:21:30 test_pfcwd_timer_accuracy.retrieve_times L0371 WARNING| ##### Randomly skip timestamp 06:20:36.569082 parsing for pattern: [s]torm restored
06:21:30 test_pfcwd_timer_accuracy.run_test       L0228 WARNING| storm_start_ms 1736490020022 or storm_detect_ms 1736490020552 or storm_end_ms 1736490035837 or storm_restore_ms 0 is 0
06:21:30 test_pfcwd_timer_accuracy.run_test       L0232 WARNING| Skip this loop due to missing timestamps
PASSED   
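
The log above suggests that a loop is skipped whenever one of the four timestamps is 0. The following is a sketch of that behavior under stated assumptions: the function name `collect_durations` is hypothetical, and computing the detect time as detect minus start and the restore time as restore minus end is an assumption, not taken from the test file.

```python
def collect_durations(loops):
    """Collect detect and restore durations, skipping incomplete loops.

    `loops` is an iterable of (storm_start_ms, storm_detect_ms,
    storm_end_ms, storm_restore_ms) tuples. A value of 0 means the
    timestamp could not be parsed from the syslog.
    """
    detect_times = []
    restore_times = []
    for start_ms, detect_ms, end_ms, restore_ms in loops:
        if 0 in (start_ms, detect_ms, end_ms, restore_ms):
            # Mirrors "Skip this loop due to missing timestamps"
            continue
        detect_times.append(detect_ms - start_ms)    # assumed definition
        restore_times.append(restore_ms - end_ms)    # assumed definition
    return detect_times, restore_times
```

With this shape, a missing timestamp reduces the number of samples rather than failing the whole test.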

Any platform specific information?

Supported testbed topology if it's a new test case?

Documentation

@mssonicbld (Collaborator)

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@lipxu lipxu requested review from StormLiangMS and cyw233 January 10, 2025 07:23
@StormLiangMS (Collaborator) left a comment

LGTM

@mssonicbld (Collaborator)

@lipxu PR conflicts with 202405 branch

@mssonicbld (Collaborator)

@lipxu PR conflicts with 202411 branch

@lipxu (Contributor Author) commented Jan 13, 2025

202405
#16490

@lipxu (Contributor Author) commented Jan 13, 2025

202411
#16491

StormLiangMS pushed a commit that referenced this pull request Jan 15, 2025
… grep log #16490

What is the motivation for this PR?
#16446
The merged PR conflicts with the 202405 branch.

How did you do it?
Merged manually.

# Regular expressions for the two timestamp formats
regex1 = re.compile(r'^[A-Za-z]{3} \d{2} \d{2}:\d{2}:\d{2}\.\d{6}')
regex2 = re.compile(r'^\d{4} [A-Za-z]{3} \d{2} \d{2}:\d{2}:\d{2}\.\d{6}')
Contributor (review comment)

@lipxu is there any reason why we're switching to regex here?

Contributor Author (reply)

Sorry, @auspham, I just saw the comment; apologies for the late response.
I used regex here because I initially wanted to standardize the handling by searching for the timestamp and processing both formats with the same function. However, during unit testing I found that the date command does not support the format with the year, so I reverted to the previous handling process but kept the regex change here. Thanks.

Contributor Author (reply)

Maybe we can search only for the timestamp without the year. Let me try.
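
One way this could look, as a sketch only: an unanchored search that returns the month, day, and time portion whether or not the line starts with a year. The names `TS_WITHOUT_YEAR` and `timestamp_without_year` are hypothetical, and the pattern is an assumption rather than the one used in the test.

```python
import re

# Unanchored on purpose: matches "Mon DD HH:MM:SS.ffffff" anywhere in the line,
# so a leading four-digit year is simply not part of the match.
TS_WITHOUT_YEAR = re.compile(r'[A-Za-z]{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}\.\d{6}')


def timestamp_without_year(line):
    """Return the timestamp substring without the year, or None if absent."""
    match = TS_WITHOUT_YEAR.search(line)
    return match.group(0) if match else None
```

The returned string has the same shape for both syslog formats, so a single downstream conversion could handle it.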

yejianquan pushed a commit that referenced this pull request Feb 3, 2025
Description of PR
Summary:
Fixes # (issue) 31202914

Since #16446, a regex is used to capture the timestamp for the pfc timer accuracy test. However, the regex does not cover the case where the day of the month has a single digit. For example: 2025 Jan  5 00:53:14.103188

Approach
What is the motivation for this PR?
Comparing between:

2025 Jan 26 17:36:31.27789
2025 Feb  2 15:17:24.055182
This shows that up to two spaces may follow the month and that the day may have one or two digits.

How did you do it?
Adjust the regex so that it matches both formats.
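
A sketch of what such an adjustment might look like. `OLD_WITH_YEAR` is the pattern quoted in the review thread for #16446; the two `NEW_*` patterns and the `classify` helper are assumptions for illustration and are not copied from the follow-up fix.

```python
import re

# Pattern from #16446: exactly one space and exactly two day digits.
OLD_WITH_YEAR = re.compile(r'^\d{4} [A-Za-z]{3} \d{2} \d{2}:\d{2}:\d{2}\.\d{6}')

# Assumed adjustment: one or two spaces after the month, one or two day digits.
NEW_WITHOUT_YEAR = re.compile(r'^[A-Za-z]{3}\s{1,2}\d{1,2} \d{2}:\d{2}:\d{2}\.\d{6}')
NEW_WITH_YEAR = re.compile(r'^\d{4} [A-Za-z]{3}\s{1,2}\d{1,2} \d{2}:\d{2}:\d{2}\.\d{6}')


def classify(line):
    """Return which timestamp format `line` starts with, or None."""
    if NEW_WITH_YEAR.match(line):
        return "with_year"
    if NEW_WITHOUT_YEAR.match(line):
        return "without_year"
    return None
```

The old pattern rejects a line such as `2025 Feb  2 15:17:24.055182` because it requires a single space followed by two digits, while the adjusted patterns accept it.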

How did you verify/test it?
image

Signed-off-by: Austin Pham <austinpham@microsoft.com>
mssonicbld pushed a commit to mssonicbld/sonic-mgmt that referenced this pull request Feb 3, 2025
mssonicbld pushed a commit that referenced this pull request Feb 3, 2025
@bingwang-ms (Collaborator)

@lipxu We saw the error below recently. Can you help check if the error is caused by this change?

        err_msg = ("Real detection time is greater than configured: Real detect time: {} "
>                  "Expected: {} (wd_detect_time + wd_poll_time)".format(self.all_detect_time[check_point],
                                                                         config_detect_time))
E       IndexError: list index out of range

mssonicbld pushed a commit to mssonicbld/sonic-mgmt that referenced this pull request Feb 11, 2025
mssonicbld pushed a commit that referenced this pull request Feb 11, 2025
@lipxu (Contributor Author) commented Feb 11, 2025

@lipxu We saw the error below recently. Can you help check if the error is caused by this change?

        err_msg = ("Real detection time is greater than configured: Real detect time: {} "
>                  "Expected: {} (wd_detect_time + wd_poll_time)".format(self.all_detect_time[check_point],
                                                                         config_detect_time))
E       IndexError: list index out of range

Hi @bingwang-ms, yes, this should be related to the timestamp parser. It is fixed by PR #16757; please let me know if we still hit the issue with this fix. Thanks.
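
For context, an `IndexError` like the one quoted above can occur when fewer detection samples were collected than expected, for example when timestamps fail to parse. The sketch below shows a defensive way to build the message; the function name `detect_time_error` and the fallback wording are hypothetical, and this is not the fix from #16757.

```python
def detect_time_error(all_detect_time, check_point, config_detect_time):
    """Build the error message without raising IndexError on short lists."""
    if not -len(all_detect_time) <= check_point < len(all_detect_time):
        # Hypothetical fallback message for a missing sample
        return ("No detection time sample at index {} (collected {}); "
                "check syslog timestamp parsing".format(check_point,
                                                        len(all_detect_time)))
    return ("Real detection time is greater than configured: Real detect time: {} "
            "Expected: {} (wd_detect_time + wd_poll_time)".format(
                all_detect_time[check_point], config_detect_time))
```

This only changes how the failure is reported; the underlying parsing problem still has to be fixed.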

wangxin pushed a commit to wangxin/sonic-mgmt that referenced this pull request Feb 21, 2025
nnelluri-cisco pushed a commit to nnelluri-cisco/sonic-mgmt that referenced this pull request Mar 15, 2025
…et#16446)

nnelluri-cisco pushed a commit to nnelluri-cisco/sonic-mgmt that referenced this pull request Mar 15, 2025